Take Home Exercise 3a: Modelling Geography of Financial Inclusion with Geographically Weighted Methods

Author

Jeffrey Lee Shao Lin

Published

October 28, 2024

Modified

November 15, 2024

1. Introduction

According to Wikipedia, financial inclusion is the availability and equality of opportunities to access financial services. It refers to processes by which individuals and businesses can access appropriate, affordable, and timely financial products and services, including banking, loan, equity, and insurance products. It provides paths to enhance inclusiveness in economic growth by enabling the unbanked population to access the means for savings, investment, and insurance towards improving household income and reducing income inequality.

2. The Task

In this take-home exercise, we are required to build an explanatory model to determine the factors affecting financial inclusion by using geographically weighted regression methods.

3. The Data

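For the purpose of this take-home exercise, two data sets are used:

  • geoBoundaries-UGA-ADM2, the Uganda district-level boundaries in ESRI shapefile format

  • FinScope-2023_Dataset_Final, the FinScope Uganda survey data in csv format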

4. Importing Packages

Before we start the exercise, we need to load the necessary R packages. The main ones are:

  • olsrr package for building OLS regression models and performing diagnostic tests

  • GWmodel package for calibrating the geographically weighted family of models

  • corrplot package for multivariate data visualisation and analysis

  • sf package, which provides functions to manage, process, and manipulate Simple Features, a formal geospatial data standard that specifies a storage and access model of spatial geometries such as points, lines, and polygons

  • tmap package, which provides functions for plotting cartographic-quality static maps or interactive maps by using the leaflet API

The code chunk below uses p_load() of the pacman package to install (if necessary) and load these packages together with the other supporting packages.

pacman::p_load(olsrr, ggstatsplot, corrplot, ggpubr, sf, spdep, GWmodel, tmap, tidyverse, gtsummary, performance, see, sfdep)
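To see which of these packages p_load() would have to install first, the names can be compared against the installed library. This is a minimal base R sketch; the object names pkgs and missing_pkgs are illustrative and not part of the exercise.

```r
# Packages passed to pacman::p_load() in this exercise
pkgs <- c("olsrr", "ggstatsplot", "corrplot", "ggpubr", "sf", "spdep",
          "GWmodel", "tmap", "tidyverse", "gtsummary", "performance",
          "see", "sfdep")

# Names of packages that are not found in the current library paths
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
print(missing_pkgs)
```

An empty result means every package is already installed and p_load() will only attach them.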

5. Getting the Data Into R Environment

5.1 Importing geospatial data

The geospatial data used in this take-home exercise is called geoBoundaries-UGA-ADM2. It is in ESRI shapefile format and contains the district-level boundaries of Uganda, represented as polygon features. As the import report below shows, the data is in the WGS 84 geographic coordinate system rather than a projected one.

The code chunk below imports the geoBoundaries-UGA-ADM2 shapefile by using st_read() of the sf package.

# Load district level boundary GIS data

boundaries2 <- st_read(dsn = "data/rawdata/geoBoundaries-UGA-ADM2-all", 
                layer = "geoBoundaries-UGA-ADM2")
Reading layer `geoBoundaries-UGA-ADM2' from data source 
  `C:\Users\user\OneDrive - Singapore Management University\MITB\6. Geospatial Analytics and Applications\jeffleesl\ISSS626-GAA\Take-Home_Ex\Take-Home_Ex03\data\rawdata\geoBoundaries-UGA-ADM2-all' 
  using driver `ESRI Shapefile'
Simple feature collection with 151 features and 5 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 29.56838 ymin: -1.4732 xmax: 35.02676 ymax: 4.228399
Geodetic CRS:  WGS 84

5.1.1 Updating CRS Information

Uganda is located in East Africa between 1º S and 4º N latitude, and between 30º E and 35º E longitude.
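The longitude range determines which UTM zone, and therefore which EPSG code, suits the data. The sketch below works this out from the bounding box reported by st_read(); the helper utm_zone() is a hypothetical name introduced for illustration, and the 32600/32700 offsets apply to WGS 84 / UTM codes only.

```r
# Longitude range of the boundary layer (from the st_read() bounding box)
lon_min <- 29.56838
lon_max <- 35.02676

# UTM zones are 6 degrees wide, numbered from 1 starting at 180 degrees W
utm_zone <- function(lon) floor((lon + 180) / 6) + 1

# Zone at the centre of the longitude range
zone_centre <- utm_zone((lon_min + lon_max) / 2)

# WGS 84 / UTM codes: 32600 + zone (north), 32700 + zone (south)
epsg_north <- 32600 + zone_centre
epsg_south <- 32700 + zone_centre

print(c(zone = zone_centre, north = epsg_north, south = epsg_south))
```

The centre of the range falls in zone 36, which gives 32736 for the southern-hemisphere variant. Uganda straddles the equator, so either hemisphere variant could be argued for, and the westernmost edge of the bounding box lies in zone 35.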

The code chunk below transforms the newly imported boundaries2 from WGS 84 to a projected CRS with the EPSG code 32736 (21096 is an alternative).

# Transform to the correct EPSG code

boundaries <- st_transform(boundaries2, 32736)
# Verify the newly transformed boundaries

st_crs(boundaries)
Coordinate Reference System:
  User input: EPSG:32736 
  wkt:
PROJCRS["WGS 84 / UTM zone 36S",
    BASEGEOGCRS["WGS 84",
        ENSEMBLE["World Geodetic System 1984 ensemble",
            MEMBER["World Geodetic System 1984 (Transit)"],
            MEMBER["World Geodetic System 1984 (G730)"],
            MEMBER["World Geodetic System 1984 (G873)"],
            MEMBER["World Geodetic System 1984 (G1150)"],
            MEMBER["World Geodetic System 1984 (G1674)"],
            MEMBER["World Geodetic System 1984 (G1762)"],
            MEMBER["World Geodetic System 1984 (G2139)"],
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]],
            ENSEMBLEACCURACY[2.0]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4326]],
    CONVERSION["UTM zone 36S",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",33,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9996,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",500000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",10000000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Navigation and medium accuracy spatial referencing."],
        AREA["Between 30°E and 36°E, southern hemisphere between 80°S and equator, onshore and offshore. Burundi. Eswatini (Swaziland). Kenya. Malawi. Mozambique. Rwanda. South Africa. Tanzania. Uganda. Zambia. Zimbabwe."],
        BBOX[-80,30,0,36]],
    ID["EPSG",32736]]
st_bbox(boundaries) #view extent
      xmin       ymin       xmax       ymax 
  117997.3  9836930.8   725449.1 10467443.7 
tm_shape(boundaries) +
  tm_polygons()

## Convert multipolygons to individual polygons
boundaries_sf <- boundaries %>% 
  st_cast("POLYGON") %>% 
  mutate(area = st_area(.))
Warning in st_cast.sf(., "POLYGON"): repeating attributes for all
sub-geometries for which they may not be constant
## Group by the unique name and select the largest polygon by area
boundaries_cleaned <- boundaries_sf %>% 
  group_by(shapeName) %>% 
  filter(area == max(area)) %>% 
  ungroup() %>% 
  select(-area) %>% 
  select(shapeName) %>% 
  rename(
  county_name = shapeName 
  )
tm_shape(boundaries_cleaned) +
  tm_polygons()
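The "keep the largest polygon per district" step can be illustrated on a toy table without any spatial packages. This is a base R sketch under the assumption that each row is one polygon part; the data frame parts and its values are made up for illustration.

```r
# Toy stand-in for boundaries_sf: two districts, one split into two parts
parts <- data.frame(
  shapeName = c("A", "A", "B"),
  area      = c(120.5, 3.2, 80.0)
)

# For each district keep the row with the largest area
# (which.max() returns the first maximum, so a tie yields a single row)
largest <- do.call(
  rbind,
  lapply(split(parts, parts$shapeName), function(g) g[which.max(g$area), ])
)
print(largest)
```

Unlike which.max(), the filter(area == max(area)) call used above keeps every row that ties for the maximum. In both versions the smaller parts of a district, such as islands, are dropped from the cleaned layer.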

5.2 Importing the aspatial data, FinScope Uganda

The FinScope-2023_Dataset_Final data set is in csv file format. The code chunk below uses the read_csv() function of the readr package to import FinScope-2023_Dataset_Final into R as a tibble data frame called uganda_data.

uganda_data <- read_csv("data/rawdata/FinScope-2023_Dataset_Final.csv")
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)
Rows: 3176 Columns: 686
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (25): HH_ID, Interview_ID, ea_name, District, Region, Subregion, Rural_...
dbl (638): ea_code, age, disabled, Pweight, Lhhid, Enum_code, InterviewDate,...
lgl  (23): f3_2_14, f3_2_15, f3_3_14, f3_3_15, f3_5_14, f3_5_15, g12_2_1, g1...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

5.2.1 Variables to Consider for Financial Inclusion

Check the column names in uganda_data to identify the variables of interest.

colnames(uganda_data) # Displays all column names in the dataset

To determine factors affecting financial inclusion, consider including the following types of variables:

  • Age and Age Band

  • Gender

  • Education Level

  • Mobile User

  • Income Level

  • Employment Status

  • Urban vs. rural status

  • Distance to the nearest bank or financial institution from home (Commercial Bank, SACCO and Mobile Money)

  • Distance to nearest ATM from Home

  • Financial Advice

  • Save Money and the channel (Commercial Bank, SACCO and Mobile Money)

  • Last amount saved

  • Borrow Money and the channel (Commercial Bank, SACCO and Mobile Money)

  • Last amount borrowed

  • Last amount sent

  • Last amount received

  • Documentation for KYC (National Identification Card, Passport, Utilities and Pay Slip)

  • Self Sustaining

5.2.1.1 Rename the Variables

uganda_data_rename <- uganda_data %>%
  select(-c(2:7, 9, 11:17, 20, 23:28, 30:34, 36:37, 40:43, 45:63, 65, 67:90, 93:94, 96, 98:167, 169:230, 232:234, 236:238, 240:241, 243:342, 344:364, 366:384, 386:438, 440:444, 448:473, 476:674, 677:679, 681:686)) %>% 
  rename(
    age_band = c1,
    gender = c2,
    education_level = c4,
    employment_status = c5,
    mobile_user = c7_1_1,
    national_ic_doc = c8_1a,
    passport_doc = c8_1d,
    utilities_bill_doc = c8_1e,
    pay_slip_doc = c8_1j,
    self_sustaining = e1_1,
    financial_advice = e3_1,
    save_money = f2_1,
    save_money_commercial_bank = f3_1_1,
    save_money_SACCO = f3_1_4,
    save_money_mobile_money = f3_1_6,
    last_amt_saved = f6_1,
    last_amt_borrowed = g3_3,
    borrow_money_commercial_bank = g6_1_1,
    borrow_money_SACCO = g6_1_5,
    borrow_money_mobile_money = g6_1_8,
    last_amt_sent = hpp3_2,
    last_amt_received = hpp6_2,
    own_insurance = j1,
    distance_commerical_bank = k1_1_1,
    distance_SACCOS = k1_1_7,
    distance_ATM = k1_1_8,
    distance_mobile_money = k1_1_9,
    savings_account = kcb1_1_1,
    joint_account = kcb1_1_2, 
    latitude = hh_gps_latitude,
    longitude = hh_gps_longitude,
    county_name = s1aq2b
  )

5.2.1.2 Clean the Variables

uganda_data_new <- uganda_data_rename %>%
  filter(!is.na(longitude) & longitude != "",
         !is.na(latitude) & latitude != "") %>%
  replace_na(list(
    save_money_commercial_bank = 2,
    save_money_SACCO = 2,
    save_money_mobile_money = 2,
    last_amt_saved = 9,
    last_amt_borrowed = 998,
    borrow_money_commercial_bank = 2,
    borrow_money_SACCO = 2,
    borrow_money_mobile_money = 2,
    last_amt_sent = 998,
    last_amt_received = 998
  )) %>%
  mutate(across(c(savings_account, joint_account), 
                ~ if_else(is.na(.) | . == "", 2, .)))
head(uganda_data_new$longitude) # inspect the longitude column
[1] 33.65414 33.65328 33.65403 33.65586 33.65472 33.65549
head(uganda_data_new$latitude) # inspect the latitude column
[1] 2.677662 2.675690 2.673339 2.671343 2.671923 2.672423

Next, summary() of base R is used to display the summary statistics of uganda_data_new tibble data frame.

summary(uganda_data_new)
    HH_ID              Region          Rural_Urban           age_band    
 Length:3176        Length:3176        Length:3176        Min.   :1.000  
 Class :character   Class :character   Class :character   1st Qu.:3.000  
 Mode  :character   Mode  :character   Mode  :character   Median :4.000  
                                                          Mean   :3.986  
                                                          3rd Qu.:5.000  
                                                          Max.   :7.000  
     gender      education_level employment_status  mobile_user   
 Min.   :1.000   Min.   :1.000   Min.   : 1.00     Min.   :1.000  
 1st Qu.:1.000   1st Qu.:2.000   1st Qu.: 1.00     1st Qu.:1.000  
 Median :2.000   Median :3.000   Median : 2.00     Median :1.000  
 Mean   :1.552   Mean   :3.169   Mean   : 3.67     Mean   :1.273  
 3rd Qu.:2.000   3rd Qu.:4.000   3rd Qu.: 5.00     3rd Qu.:2.000  
 Max.   :2.000   Max.   :9.000   Max.   :99.00     Max.   :2.000  
 national_ic_doc  passport_doc  utilities_bill_doc  pay_slip_doc  
 Min.   :1.000   Min.   :1.00   Min.   :1.000      Min.   :1.000  
 1st Qu.:1.000   1st Qu.:2.00   1st Qu.:2.000      1st Qu.:2.000  
 Median :1.000   Median :2.00   Median :2.000      Median :2.000  
 Mean   :1.171   Mean   :1.96   Mean   :1.926      Mean   :1.964  
 3rd Qu.:1.000   3rd Qu.:2.00   3rd Qu.:2.000      3rd Qu.:2.000  
 Max.   :2.000   Max.   :2.00   Max.   :2.000      Max.   :2.000  
 self_sustaining financial_advice   save_money   save_money_commercial_bank
 Min.   :1.000   Min.   :1.000    Min.   :1.00   Min.   :1.000             
 1st Qu.:2.000   1st Qu.:1.000    1st Qu.:1.00   1st Qu.:2.000             
 Median :2.000   Median :1.000    Median :1.00   Median :2.000             
 Mean   :1.846   Mean   :1.401    Mean   :1.36   Mean   :1.896             
 3rd Qu.:2.000   3rd Qu.:2.000    3rd Qu.:2.00   3rd Qu.:2.000             
 Max.   :2.000   Max.   :2.000    Max.   :2.00   Max.   :2.000             
 save_money_SACCO save_money_mobile_money last_amt_saved  last_amt_borrowed
 Min.   :1.000    Min.   :1.000           Min.   :1.000   Min.   :  1.0    
 1st Qu.:2.000    1st Qu.:1.000           1st Qu.:1.000   1st Qu.:  3.0    
 Median :2.000    Median :2.000           Median :3.000   Median :998.0    
 Mean   :1.899    Mean   :1.738           Mean   :4.646   Mean   :626.1    
 3rd Qu.:2.000    3rd Qu.:2.000           3rd Qu.:9.000   3rd Qu.:998.0    
 Max.   :2.000    Max.   :2.000           Max.   :9.000   Max.   :998.0    
 borrow_money_commercial_bank borrow_money_SACCO borrow_money_mobile_money
 Min.   :1.000                Min.   :1.000      Min.   :1.000            
 1st Qu.:2.000                1st Qu.:2.000      1st Qu.:2.000            
 Median :2.000                Median :2.000      Median :2.000            
 Mean   :1.983                Mean   :1.978      Mean   :1.971            
 3rd Qu.:2.000                3rd Qu.:2.000      3rd Qu.:2.000            
 Max.   :2.000                Max.   :2.000      Max.   :2.000            
 last_amt_sent   last_amt_received own_insurance   distance_commerical_bank
 Min.   :  1.0   Min.   :  1.0     Min.   :1.000   Min.   :1.000           
 1st Qu.:  1.0   1st Qu.:  1.0     1st Qu.:2.000   1st Qu.:2.000           
 Median :998.0   Median :997.0     Median :2.000   Median :4.000           
 Mean   :582.9   Mean   :510.8     Mean   :1.974   Mean   :3.176           
 3rd Qu.:998.0   3rd Qu.:998.0     3rd Qu.:2.000   3rd Qu.:4.000           
 Max.   :998.0   Max.   :998.0     Max.   :2.000   Max.   :4.000           
 distance_SACCOS  distance_ATM   distance_mobile_money savings_account
 Min.   :1.000   Min.   :1.000   Min.   :1.000         Min.   :1.000  
 1st Qu.:2.000   1st Qu.:2.000   1st Qu.:1.000         1st Qu.:2.000  
 Median :2.000   Median :4.000   Median :1.000         Median :2.000  
 Mean   :2.508   Mean   :3.152   Mean   :1.655         Mean   :1.963  
 3rd Qu.:4.000   3rd Qu.:4.000   3rd Qu.:2.000         3rd Qu.:2.000  
 Max.   :4.000   Max.   :4.000   Max.   :4.000         Max.   :2.000  
 joint_account    latitude         longitude     county_name       
 Min.   :1     Min.   :-1.4128   Min.   : 0.00   Length:3176       
 1st Qu.:2     1st Qu.: 0.2393   1st Qu.:30.99   Class :character  
 Median :2     Median : 0.7726   Median :32.54   Mode  :character  
 Mean   :2     Mean   : 0.9945   Mean   :31.53                     
 3rd Qu.:2     3rd Qu.: 1.9143   3rd Qu.:33.52                     
 Max.   :2     Max.   : 3.6876   Max.   :34.96                     
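The summary reports a minimum longitude of 0.00, which lies outside the bounding box of the boundary layer (xmin of about 29.57). A quick way to flag such records is to compare each coordinate pair with that bounding box. The sketch below uses made-up toy points; the objects bbox, pts and inside are illustrative names.

```r
# Bounding box of the district boundaries in degrees (from st_read() output)
bbox <- c(xmin = 29.56838, ymin = -1.4732, xmax = 35.02676, ymax = 4.228399)

# Toy coordinates; the second row mimics a record with longitude 0
pts <- data.frame(
  longitude = c(33.65414, 0, 32.54),
  latitude  = c(2.677662, 0.7726, 0.2393)
)

# TRUE where a point lies inside the bounding box
inside <- pts$longitude >= bbox[["xmin"]] & pts$longitude <= bbox[["xmax"]] &
  pts$latitude >= bbox[["ymin"]] & pts$latitude <= bbox[["ymax"]]

# Records that fall outside and may need to be reviewed or removed
print(pts[!inside, ])
```

Applied to uganda_data_new, the same comparison would show how many survey records carry coordinates that cannot fall inside Uganda.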

5.2.2 Convert to Percentage and Log-Transformations

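These derived variables are used to calibrate the regression models later. Note that each _pct variable is the respondent's coded value divided by the number of records (3,176) and multiplied by 100, so it is a linear rescaling of the code rather than a share of respondents. The LOG_ variables are natural logarithms of the coded values, including the placeholder codes (9 and 998) used to fill missing values in the previous step.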

uganda_data_fin <- uganda_data_new %>%
  mutate(
    LOG_age_band = log(age_band),
    gender_pct = gender / 3176 * 100,
    LOG_education_level = log(education_level),
    LOG_employment_status = log(employment_status),
    mobile_user_pct = mobile_user / 3176 * 100,
    national_ic_doc_pct = national_ic_doc / 3176 * 100,
    passport_doc_pct = passport_doc / 3176 * 100,
    utilities_bill_do_pct = utilities_bill_doc / 3176 * 100,
    pay_slip_doc_pct = pay_slip_doc / 3176 * 100,
    self_sustaining_pct = self_sustaining / 3176 * 100,
    financial_advice_pct = financial_advice / 3176 * 100,
    save_money_pct = save_money / 3176 * 100,
    save_money_commercial_bank_pct = save_money_commercial_bank / 3176 * 100,
    save_money_SACCO_pct = save_money_SACCO / 3176 * 100,
    save_money_mobile_money_pct = save_money_mobile_money / 3176 * 100,
    LOG_last_amt_saved = log(last_amt_saved),
    LOG_last_amt_borrowed = log(last_amt_borrowed),
    borrow_money_commercial_bank_pct = borrow_money_commercial_bank / 3176 * 100,
    borrow_money_SACCO_pct = borrow_money_SACCO / 3176 * 100,
    borrow_money_mobile_money_pct = borrow_money_mobile_money / 3176 * 100,
    LOG_last_amt_sent = log(last_amt_sent),
    LOG_last_amt_received = log(last_amt_received),
    own_insurance_pct = own_insurance / 3176 * 100,
    LOG_distance_commerical_bank = log(distance_commerical_bank),
    LOG_distance_SACCOS = log(distance_SACCOS),
    LOG_distance_ATM = log(distance_ATM),
    LOG_distance_mobile_money = log(distance_mobile_money),
    savings_account_pct = savings_account / 3176 * 100,
    joint_account_pct = joint_account / 3176 * 100 
  )
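To make the scale of the derived variables concrete, the arithmetic can be reproduced for the codes 1 and 2, the only values that the summary shows for variables such as savings_account. This is a small base R sketch; the object names are illustrative.

```r
n <- 3176          # number of survey records
codes <- c(1, 2)   # the only values observed in the two-level variables

# Same arithmetic as the *_pct columns above
pct <- codes / n * 100
print(round(pct, 5))

# Natural log of small codes versus the placeholder 998 used for missing amounts
print(round(log(c(1, 3, 998)), 4))
```

Each _pct column therefore takes only two distinct values, roughly 0.03 and 0.06, and the logged amount variables place the 998 placeholder far from the genuine codes. Both points are worth keeping in mind when reading the histograms and regression coefficients that follow.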

5.3 Converting aspatial data frame into a sf object

Currently, the uganda_data_fin tibble data frame is aspatial. The code chunk below converts it into a simple feature data frame by using st_as_sf() of the sf package.

uganda_data.sf <- st_as_sf(uganda_data_fin,
                           coords = c("longitude", "latitude"),
                           crs=4326) %>%
  st_transform(crs=32736) 

Notice that st_transform() of the sf package is used to convert the coordinates from WGS 84 (i.e. crs = 4326) to WGS 84 / UTM zone 36S (i.e. crs = 32736), the same projected CRS as the boundary layer.

Next, head() is used to list the content of uganda_data.sf object.

head(uganda_data.sf)
Simple feature collection with 6 features and 62 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 572616.1 ymin: 10295290 xmax: 572903.6 ymax: 10295980
Projected CRS: WGS 84 / UTM zone 36S
# A tibble: 6 × 63
  HH_ID  Region   Rural_Urban age_band gender education_level employment_status
  <chr>  <chr>    <chr>          <dbl>  <dbl>           <dbl>             <dbl>
1 001001 NORTHERN Urban              4      2               6                 1
2 001019 NORTHERN Urban              4      2               2                 5
3 001028 NORTHERN Urban              3      2               1                 5
4 001037 NORTHERN Urban              4      1               2                 1
5 001040 NORTHERN Urban              4      2               3                 4
6 001047 NORTHERN Urban              1      1               2                 9
# ℹ 56 more variables: mobile_user <dbl>, national_ic_doc <dbl>,
#   passport_doc <dbl>, utilities_bill_doc <dbl>, pay_slip_doc <dbl>,
#   self_sustaining <dbl>, financial_advice <dbl>, save_money <dbl>,
#   save_money_commercial_bank <dbl>, save_money_SACCO <dbl>,
#   save_money_mobile_money <dbl>, last_amt_saved <dbl>,
#   last_amt_borrowed <dbl>, borrow_money_commercial_bank <dbl>,
#   borrow_money_SACCO <dbl>, borrow_money_mobile_money <dbl>, …

Notice that the output is a point feature data frame.

6. Exploratory Data Analysis (EDA)

In this section, the statistical graphics functions of the ggplot2 package are used to perform EDA.

6.1 EDA using statistical graphics

The code chunks below plot the distributions of savings_account_pct and save_money_mobile_money_pct as histograms.

ggplot(data=uganda_data.sf, aes(x=`savings_account_pct`)) +
  geom_histogram(bins=20, color="black", fill="light blue")

ggplot(data=uganda_data.sf, aes(x=`save_money_mobile_money_pct`)) +
  geom_histogram(bins=20, color="#0B2130", fill="#AB88BA")

6.2 Multiple Histogram Plots distribution of variables

Next, draw multiple small histograms (also known as a trellis plot) by using ggarrange() of the ggpubr package to analyse the variables.

LOG_age_band <- ggplot(data=uganda_data.sf, aes(x= `LOG_age_band`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166")
  
gender_pct <- ggplot(data=uganda_data.sf, aes(x= `gender_pct`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166")
  
LOG_education_level <- ggplot(data=uganda_data.sf, aes(x= `LOG_education_level`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166")
    
LOG_employment_status <- ggplot(data=uganda_data.sf, aes(x= `LOG_employment_status`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166")  
    
mobile_user_pct <- ggplot(data=uganda_data.sf, aes(x= `mobile_user_pct`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166") 
    
national_ic_doc_pct <- ggplot(data=uganda_data.sf, aes(x= `national_ic_doc_pct`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166") 

passport_doc_pct <- ggplot(data=uganda_data.sf, aes(x= `passport_doc_pct`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166") 

utilities_bill_do_pct <- ggplot(data=uganda_data.sf, aes(x= `utilities_bill_do_pct`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166") 
    
pay_slip_doc_pct <- ggplot(data=uganda_data.sf, aes(x= `pay_slip_doc_pct`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166") 

self_sustaining_pct <- ggplot(data=uganda_data.sf, aes(x= `self_sustaining_pct`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166") 

financial_advice_pct <- ggplot(data=uganda_data.sf, aes(x= `financial_advice_pct`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166")    
    
save_money_pct <- ggplot(data=uganda_data.sf, aes(x= `save_money_pct`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166")    
    
save_money_commercial_bank_pct <- ggplot(data=uganda_data.sf, aes(x= `save_money_commercial_bank_pct`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166")   
    
save_money_SACCO_pct <- ggplot(data=uganda_data.sf, aes(x= `save_money_SACCO_pct`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166") 

save_money_mobile_money_pct <- ggplot(data=uganda_data.sf, aes(x= `save_money_mobile_money_pct`)) + 
  geom_histogram(bins=20, color="black", fill="#FFC166") 

ggarrange(LOG_age_band, gender_pct, LOG_education_level, LOG_employment_status, 
          mobile_user_pct, national_ic_doc_pct, passport_doc_pct, utilities_bill_do_pct, 
          pay_slip_doc_pct, self_sustaining_pct, financial_advice_pct, save_money_pct, 
          save_money_commercial_bank_pct, save_money_SACCO_pct, save_money_mobile_money_pct, 
          ncol = 3, nrow = 5)

LOG_last_amt_saved <- ggplot(data=uganda_data.sf, aes(x= `LOG_last_amt_saved`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F") 

LOG_last_amt_borrowed <- ggplot(data=uganda_data.sf, aes(x= `LOG_last_amt_borrowed`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")

borrow_money_commercial_bank_pct <- ggplot(data=uganda_data.sf, aes(x= `borrow_money_commercial_bank_pct`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
    
borrow_money_SACCO_pct <- ggplot(data=uganda_data.sf, aes(x= `borrow_money_SACCO_pct`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")

borrow_money_mobile_money_pct <- ggplot(data=uganda_data.sf, aes(x= `borrow_money_mobile_money_pct`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")

LOG_last_amt_sent <- ggplot(data=uganda_data.sf, aes(x= `LOG_last_amt_sent`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")

LOG_last_amt_received <- ggplot(data=uganda_data.sf, aes(x= `LOG_last_amt_received`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")

own_insurance_pct <- ggplot(data=uganda_data.sf, aes(x= `own_insurance_pct`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")

LOG_distance_commerical_bank <- ggplot(data=uganda_data.sf, aes(x= `LOG_distance_commerical_bank`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")

LOG_distance_SACCOS <- ggplot(data=uganda_data.sf, aes(x= `LOG_distance_SACCOS`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")

LOG_distance_ATM <- ggplot(data=uganda_data.sf, aes(x= `LOG_distance_ATM`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")

LOG_distance_mobile_money <- ggplot(data=uganda_data.sf, aes(x= `LOG_distance_mobile_money`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")

savings_account_pct <- ggplot(data=uganda_data.sf, aes(x= `savings_account_pct`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")

joint_account_pct <- ggplot(data=uganda_data.sf, aes(x= `joint_account_pct`)) + 
  geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")

ggarrange(LOG_last_amt_saved, LOG_last_amt_borrowed, borrow_money_commercial_bank_pct, borrow_money_SACCO_pct, 
          borrow_money_mobile_money_pct, LOG_last_amt_sent, LOG_last_amt_received, own_insurance_pct, 
          LOG_distance_commerical_bank, LOG_distance_SACCOS, LOG_distance_ATM, LOG_distance_mobile_money, 
          savings_account_pct, joint_account_pct,
          ncol = 4, nrow = 4)
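The histogram definitions above differ only in the variable name, so they can also be generated in a loop. The following is a sketch on a toy data frame, assuming ggplot2 is installed; toy, vars and plots are illustrative names and the toy values are random.

```r
library(ggplot2)

# Toy stand-in for uganda_data.sf with three of the derived columns
set.seed(1)
toy <- data.frame(
  LOG_age_band        = log(sample(1:7, 100, replace = TRUE)),
  gender_pct          = sample(1:2, 100, replace = TRUE) / 3176 * 100,
  LOG_education_level = log(sample(1:9, 100, replace = TRUE))
)

vars <- names(toy)

# One histogram per variable; .data[[v]] looks the column up by its name
plots <- lapply(vars, function(v) {
  ggplot(toy, aes(x = .data[[v]])) +
    geom_histogram(bins = 20, color = "black", fill = "#FFC166") +
    labs(x = v)
})
names(plots) <- vars
```

The resulting list can be handed to ggarrange() through its plotlist argument, for example ggarrange(plotlist = plots, ncol = 3, nrow = 1), instead of naming every plot object individually.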

The plots show that the majority of people lack insurance, savings, and joint accounts. Many respondents also feel they do not have enough money and would like to seek financial advice. Additionally, a large portion of those surveyed report that bank branches and ATMs are located relatively far from their homes.

7. Correlation Analysis - ggstatsplot methods

ggcorrmat(uganda_data_fin[, 36:62])
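Besides reading the correlation matrix plot, highly correlated pairs can be listed explicitly, which helps when deciding which predictors to drop before regression. This is a base R sketch on simulated data; the 0.8 cutoff is an assumed threshold, not one stated in this exercise.

```r
# Toy numeric data; x2 is built to be strongly correlated with x1
set.seed(42)
x1 <- rnorm(200)
toy <- data.frame(
  x1 = x1,
  x2 = x1 + rnorm(200, sd = 0.1),
  x3 = rnorm(200)
)

cor_mat <- cor(toy)

# Keep each pair once (upper triangle) and flag |r| above the chosen cutoff
cutoff <- 0.8
idx <- which(abs(cor_mat) > cutoff & upper.tri(cor_mat), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cor_mat)[idx[, "row"]],
  var2 = colnames(cor_mat)[idx[, "col"]],
  r    = cor_mat[idx]
)
print(high_pairs)
```

Replacing toy with the numeric columns passed to ggcorrmat() would give the equivalent table for the survey variables.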

8. Regression Modelling in R

8.1 Simple Linear Regression Method

Build a simple linear regression model by using savings_account_pct as the dependent variable and LOG_distance_commerical_bank as the independent variable.

uganda.slr <- lm(formula=savings_account_pct ~ LOG_distance_commerical_bank, data = uganda_data.sf)
summary(uganda.slr)

Call:
lm(formula = savings_account_pct ~ LOG_distance_commerical_bank, 
    data = uganda_data.sf)

Residuals:
       Min         1Q     Median         3Q        Max 
-0.0309296  0.0005565  0.0005565  0.0019347  0.0033128 

Coefficients:
                              Estimate Std. Error t value Pr(>|t|)    
(Intercept)                  0.0596595  0.0002987 199.745  < 2e-16 ***
LOG_distance_commerical_bank 0.0019882  0.0002574   7.723 1.51e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.005854 on 3174 degrees of freedom
Multiple R-squared:  0.01845,   Adjusted R-squared:  0.01814 
F-statistic: 59.65 on 1 and 3174 DF,  p-value: 1.509e-14

The output report reveals that savings_account_pct can be explained by using the formula:

      y = 0.0596595 + 0.0019882 x1

where y is savings_account_pct and x1 is LOG_distance_commerical_bank.

The R-squared of 0.01845 reveals that the simple regression model explains only about 1.8% of the variation in savings_account_pct.

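Since the p-value (1.509e-14) is much smaller than 0.0001, we reject the null hypothesis that the mean is a good estimator of savings_account_pct. The simple linear regression model is therefore a better estimator than the mean, even though its explanatory power is low.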
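The reported test statistics can be cross-checked with base R. The sketch below uses only the numbers printed in the summary(uganda.slr) output; the object names are illustrative.

```r
# Values reported in the summary(uganda.slr) output above
t_value <- 7.723
f_value <- 59.65
df1 <- 1
df2 <- 3174

# In a one-predictor model the F-statistic equals the squared t-value
print(t_value^2)

# Upper-tail probability of the F-statistic, i.e. the model p-value
p_value <- pf(f_value, df1, df2, lower.tail = FALSE)
print(p_value)

# Compare the p-value with the 0.0001 threshold
print(p_value < 0.0001)
```

The squared t-value agrees with the F-statistic up to rounding, and the comparison in the last line shows on which side of the 0.0001 threshold the p-value falls.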

8.2 Multiple Linear Regression Method

The code chunk below uses lm() to calibrate the multiple linear regression model.

sa_mlr <- lm(formula = savings_account_pct ~ LOG_age_band + gender_pct + LOG_education_level + LOG_employment_status + mobile_user_pct + national_ic_doc_pct + passport_doc_pct + utilities_bill_do_pct + pay_slip_doc_pct + self_sustaining_pct + financial_advice_pct +save_money_pct + save_money_commercial_bank_pct + save_money_SACCO_pct + save_money_mobile_money_pct + LOG_last_amt_saved + LOG_last_amt_borrowed + borrow_money_commercial_bank_pct + borrow_money_SACCO_pct + borrow_money_mobile_money_pct + LOG_last_amt_sent + LOG_last_amt_received + LOG_distance_commerical_bank + LOG_distance_SACCOS + LOG_distance_ATM + LOG_distance_mobile_money + joint_account_pct, 
                data=uganda_data.sf)
summary(sa_mlr)

Call:
lm(formula = savings_account_pct ~ LOG_age_band + gender_pct + 
    LOG_education_level + LOG_employment_status + mobile_user_pct + 
    national_ic_doc_pct + passport_doc_pct + utilities_bill_do_pct + 
    pay_slip_doc_pct + self_sustaining_pct + financial_advice_pct + 
    save_money_pct + save_money_commercial_bank_pct + save_money_SACCO_pct + 
    save_money_mobile_money_pct + LOG_last_amt_saved + LOG_last_amt_borrowed + 
    borrow_money_commercial_bank_pct + borrow_money_SACCO_pct + 
    borrow_money_mobile_money_pct + LOG_last_amt_sent + LOG_last_amt_received + 
    LOG_distance_commerical_bank + LOG_distance_SACCOS + LOG_distance_ATM + 
    LOG_distance_mobile_money + joint_account_pct, data = uganda_data.sf)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.032541 -0.000754 -0.000117  0.000530  0.016876 

Coefficients:
                                   Estimate Std. Error t value Pr(>|t|)    
(Intercept)                       2.212e-02  1.009e-02   2.193  0.02841 *  
LOG_age_band                     -1.489e-04  2.251e-04  -0.661  0.50854    
gender_pct                       -6.663e-03  5.831e-03  -1.143  0.25325    
LOG_education_level              -1.725e-04  1.855e-04  -0.930  0.35259    
LOG_employment_status             6.847e-05  1.123e-04   0.610  0.54194    
mobile_user_pct                  -7.065e-03  7.521e-03  -0.939  0.34755    
national_ic_doc_pct              -1.000e-03  8.438e-03  -0.119  0.90567    
passport_doc_pct                  4.086e-02  1.547e-02   2.642  0.00828 ** 
utilities_bill_do_pct             2.092e-02  1.186e-02   1.763  0.07792 .  
pay_slip_doc_pct                  4.974e-02  1.600e-02   3.110  0.00189 ** 
self_sustaining_pct               2.396e-02  8.176e-03   2.931  0.00341 ** 
financial_advice_pct              2.178e-03  6.152e-03   0.354  0.72338    
save_money_pct                   -6.590e-02  1.206e-02  -5.463 5.04e-08 ***
save_money_commercial_bank_pct    1.816e-01  1.099e-02  16.530  < 2e-16 ***
save_money_SACCO_pct              2.195e-01  1.037e-02  21.158  < 2e-16 ***
save_money_mobile_money_pct      -1.070e-02  7.546e-03  -1.418  0.15632    
LOG_last_amt_saved                5.387e-04  1.739e-04   3.098  0.00197 ** 
LOG_last_amt_borrowed             4.239e-05  3.228e-05   1.313  0.18920    
borrow_money_commercial_bank_pct  8.934e-02  2.279e-02   3.920 9.03e-05 ***
borrow_money_SACCO_pct            9.115e-02  2.028e-02   4.494 7.23e-06 ***
borrow_money_mobile_money_pct     4.941e-02  1.743e-02   2.835  0.00461 ** 
LOG_last_amt_sent                 5.258e-05  3.416e-05   1.539  0.12381    
LOG_last_amt_received            -2.109e-05  3.265e-05  -0.646  0.51834    
LOG_distance_commerical_bank      1.280e-03  4.694e-04   2.727  0.00643 ** 
LOG_distance_SACCOS              -4.183e-05  2.315e-04  -0.181  0.85664    
LOG_distance_ATM                 -6.954e-04  4.648e-04  -1.496  0.13473    
LOG_distance_mobile_money        -4.169e-06  2.132e-04  -0.020  0.98440    
joint_account_pct                -6.535e-02  1.570e-01  -0.416  0.67730    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.004901 on 3148 degrees of freedom
Multiple R-squared:  0.3177,    Adjusted R-squared:  0.3118 
F-statistic: 54.28 on 27 and 3148 DF,  p-value: < 2.2e-16

The Multiple R-squared is 0.3177, suggesting that around 31.8% of the variability in savings_account_pct is explained by the predictor variables in the model. The Adjusted R-squared is 0.3118, which accounts for the number of predictors, confirming that the model provides a modest fit.

The F-statistic of 54.28 with a very low p-value (< 2.2e-16) indicates that the model is statistically significant overall, meaning that at least one of the predictors is significantly associated with savings_account_pct.
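The link between the Multiple R-squared, the Adjusted R-squared and the F-statistic can be reproduced from the printed values. This is a base R sketch using the standard formulas; it starts from the rounded R-squared, so the last digits may differ slightly from the summary output.

```r
# Values reported in the summary(sa_mlr) output above
r2 <- 0.3177   # Multiple R-squared
n  <- 3176     # observations
p  <- 27       # predictors (excluding the intercept)

# Adjusted R-squared penalises R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(round(adj_r2, 4))

# Overall F-statistic implied by R-squared
f_stat <- (r2 / p) / ((1 - r2) / (n - p - 1))
print(round(f_stat, 2))
```

Here n - p - 1 equals 3148, the residual degrees of freedom shown in the output, and the small gap between the two R-squared values reflects how large n is relative to the 27 predictors.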

Key Predictors

The coefficients tell us the direction and strength of each predictor’s relationship with savings_account_pct. Here are some significant predictors:

  1. Passport Documentation (passport_doc_pct, p = 0.00828):

    • Positive relationship (Estimate = 0.0409): A higher percentage of individuals with passport documentation is associated with an increase in savings_account_pct.
  2. Payslip Documentation (pay_slip_doc_pct, p = 0.00189):

    • Positive relationship (Estimate = 0.0497): A higher percentage of individuals with pay slip documentation correlates with a higher percentage of savings accounts. This might suggest that stable income documentation positively influences savings account ownership.
  3. Self-Sustaining Percentage (self_sustaining_pct, p = 0.00341):

    • Positive relationship (Estimate = 0.0239): This suggests that communities with higher self-sustaining individuals are more likely to have savings accounts.
  4. Saving with Commercial Banks and SACCOs:

    • Save Money Percentage (save_money_pct, p = 5.04e-08, Estimate = -0.0659): Negative relationship, implying that higher rates of people saving money in general are associated with a lower savings account percentage, possibly indicating that other, informal ways of saving are being used.

    • Save Money with Commercial Bank Percentage (save_money_commercial_bank_pct, p < 2e-16, Estimate = 0.1816): Strong positive relationship, indicating that those who save with commercial banks are highly likely to have a savings account.

    • Save Money with SACCO Percentage (save_money_SACCO_pct, p < 2e-16, Estimate = 0.2195): Also a strong positive relationship, reinforcing that participation in SACCOs (Savings and Credit Cooperative Organizations) is a strong indicator of savings account ownership.

  5. Borrowing from Commercial Banks and SACCOs:

    • Borrow Money from Commercial Bank Percentage (borrow_money_commercial_bank_pct, p = 9.03e-05, Estimate = 0.0893): Indicates a positive association with savings account ownership.

    • Borrow Money from SACCO Percentage (borrow_money_SACCO_pct, p = 7.23e-06, Estimate = 0.0912): Similar positive relationship, suggesting that access to borrowing services is linked to having savings accounts.

  6. Distance to Commercial Bank (LOG_distance_commerical_bank, p = 0.00643):

    • Positive relationship (Estimate = 0.0013): As the log-distance to commercial banks increases, there’s a small but significant increase in the savings account percentage, which might suggest limited access drives individuals to hold savings accounts if they are already banked.
  7. Last Amount Saved (LOG_last_amt_saved, p = 0.00197):

    • Positive relationship (Estimate = 0.00054): The log of the last amount saved has a slight positive effect on savings account ownership, suggesting that recent saving activity is associated with having a savings account.

Non-significant Predictors

Some predictors, such as gender_pct, LOG_education_level, and LOG_age_band, show non-significant effects (high p-values). This may suggest that demographic factors like age, gender, and education do not directly impact the likelihood of savings account ownership in this context.

  • Documentation (like passports and pay slips) and income stability appear to be important indicators of savings account ownership.

  • Engagement with formal and semi-formal financial services, such as commercial banks and SACCOs, positively influences the likelihood of having a savings account.

  • Access to borrowing services is also positively linked to savings account ownership, suggesting that people who have access to credit may be more financially integrated.

  • Distance to financial services can have a minor influence, suggesting a possible need for financial services closer to communities to further increase account ownership rates.

In conclusion, the model suggests that factors related to financial habits (saving and borrowing with formal institutions) and access to financial documentation have significant impacts on the likelihood of savings account ownership in Uganda.

9. Preparing Publication Quality Table: olsrr method

With reference to the report above, it is clear that not all the independent variables are statistically significant. We will revise the model by removing the variables that are not statistically significant.

Now, we are ready to calibrate the revised model by using the code chunk below.

sa_mlr1 <- lm(formula = savings_account_pct ~ passport_doc_pct + pay_slip_doc_pct  + self_sustaining_pct + save_money_pct + save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved + borrow_money_commercial_bank_pct + borrow_money_SACCO_pct + borrow_money_mobile_money_pct + LOG_distance_commerical_bank, 
                data=uganda_data.sf)
ols_regress(sa_mlr1)
                          Model Summary                            
------------------------------------------------------------------
R                       0.560       RMSE                    0.005 
R-Squared               0.314       MSE                     0.000 
Adj. R-Squared          0.312       Coef. Var               7.928 
Pred R-Squared          0.298       AIC                -24754.869 
MAE                     0.002       SBC                -24676.045 
------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                ANOVA                                  
----------------------------------------------------------------------
               Sum of                                                 
              Squares          DF    Mean Square       F         Sig. 
----------------------------------------------------------------------
Regression      0.035          11          0.003    131.713    0.0000 
Residual        0.076        3164          0.000                      
Total           0.111        3175                                     
----------------------------------------------------------------------

                                             Parameter Estimates                                              
-------------------------------------------------------------------------------------------------------------
                           model      Beta    Std. Error    Std. Beta      t        Sig      lower     upper 
-------------------------------------------------------------------------------------------------------------
                     (Intercept)     0.017         0.002                  8.139    0.000     0.013     0.021 
                passport_doc_pct     0.047         0.015        0.049     3.112    0.002     0.017     0.076 
                pay_slip_doc_pct     0.054         0.016        0.053     3.412    0.001     0.023     0.084 
             self_sustaining_pct     0.023         0.008        0.044     2.841    0.005     0.007     0.038 
                  save_money_pct    -0.066         0.011       -0.168    -5.772    0.000    -0.088    -0.043 
  save_money_commercial_bank_pct     0.183         0.011        0.299    17.371    0.000     0.163     0.204 
            save_money_SACCO_pct     0.219         0.010        0.353    21.723    0.000     0.200     0.239 
              LOG_last_amt_saved     0.001         0.000        0.097     3.502    0.000     0.000     0.001 
borrow_money_commercial_bank_pct     0.095         0.023        0.066     4.208    0.000     0.051     0.140 
          borrow_money_SACCO_pct     0.096         0.020        0.075     4.761    0.000     0.056     0.135 
   borrow_money_mobile_money_pct     0.050         0.017        0.044     2.914    0.004     0.016     0.083 
    LOG_distance_commerical_bank     0.001         0.000        0.044     2.937    0.003     0.000     0.001 
-------------------------------------------------------------------------------------------------------------

After keeping only the independent variables that are statistically significant, there is no improvement in the R-Squared (0.314) or the Adjusted R-Squared (0.312). Furthermore, the Predicted R-Squared of 0.298 indicates that the model may perform slightly less effectively on unseen data. Both RMSE (0.005) and MAE (0.002) give an idea of the typical error magnitude. However, the AIC and SBC show lower values, which indicates a better fit given model complexity.

The regression analysis suggests that several factors significantly influence the dependent variable, with certain predictors like save_money_SACCO_pct and save_money_commercial_bank_pct having particularly strong positive effects, while save_money_pct has a negative impact. The overall model explains a moderate portion of variance in the dependent variable and is statistically significant, but there may be additional variables not included in this analysis that could improve predictive power further.
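The AIC and SBC in the model summary can be reproduced from the residual sum of squares, which helps make clear what the two criteria penalise. The sketch below is a base R illustration, assuming the usual Gaussian log-likelihood for a linear model; the residual sum of squares (0.07599864) is the value printed for the same model in the GWmodel global regression diagnostics, and the variable names are illustrative.

```r
rss <- 0.07599864   # residual sum of squares of the revised model
n   <- 3176         # number of observations
k   <- 11 + 1 + 1   # 11 slopes + intercept + error variance

# Maximised Gaussian log-likelihood of a linear model
loglik <- -n / 2 * (log(2 * pi) + log(rss / n) + 1)

# AIC charges 2 per parameter; SBC (BIC) charges log(n) per parameter
aic <- -2 * loglik + 2 * k
sbc <- -2 * loglik + log(n) * k

round(c(AIC = aic, SBC = sbc), 2)
```

Because log(3176) is about 8.06, SBC penalises each extra parameter roughly four times as heavily as AIC, which is why it favours smaller models more strongly.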

10. Check for Multicollinearity

ols_vif_tol(sa_mlr1)
                          Variables Tolerance      VIF
1                  passport_doc_pct 0.8771595 1.140043
2                  pay_slip_doc_pct 0.8958817 1.116219
3               self_sustaining_pct 0.9168856 1.090649
4                    save_money_pct 0.2562338 3.902686
5    save_money_commercial_bank_pct 0.7324681 1.365247
6              save_money_SACCO_pct 0.8216329 1.217089
7                LOG_last_amt_saved 0.2807517 3.561866
8  borrow_money_commercial_bank_pct 0.8723239 1.146363
9            borrow_money_SACCO_pct 0.8660948 1.154608
10    borrow_money_mobile_money_pct 0.9345005 1.070090
11     LOG_distance_commerical_bank 0.9452141 1.057961

All VIF values of the revised model (sa_mlr1) lie between 1 and 5, which suggests at most moderate correlation among the predictors and is unlikely to be problematic.

ols_vif_tol(sa_mlr)
                          Variables Tolerance      VIF
1                      LOG_age_band 0.7282275 1.373197
2                        gender_pct 0.9069884 1.102550
3               LOG_education_level 0.6363121 1.571556
4             LOG_employment_status 0.8117636 1.231886
5                   mobile_user_pct 0.6799937 1.470602
6               national_ic_doc_pct 0.7546980 1.325033
7                  passport_doc_pct 0.8306371 1.203895
8             utilities_bill_do_pct 0.7943377 1.258910
9                  pay_slip_doc_pct 0.8613739 1.160936
10              self_sustaining_pct 0.8773222 1.139832
11             financial_advice_pct 0.8385705 1.192506
12                   save_money_pct 0.2275655 4.394339
13   save_money_commercial_bank_pct 0.6768241 1.477489
14             save_money_SACCO_pct 0.7779627 1.285409
15      save_money_mobile_money_pct 0.6927860 1.443447
16               LOG_last_amt_saved 0.2614038 3.825499
17            LOG_last_amt_borrowed 0.7658660 1.305711
18 borrow_money_commercial_bank_pct 0.8631940 1.158488
19           borrow_money_SACCO_pct 0.8483674 1.178735
20    borrow_money_mobile_money_pct 0.8926030 1.120319
21                LOG_last_amt_sent 0.6060137 1.650128
22            LOG_last_amt_received 0.6633877 1.507414
23     LOG_distance_commerical_bank 0.2107685 4.744543
24              LOG_distance_SACCOS 0.5813174 1.720231
25                 LOG_distance_ATM 0.2093546 4.776584
26        LOG_distance_mobile_money 0.7544986 1.325384
27                joint_account_pct 0.9827299 1.017574

All VIF values of the full model (sa_mlr) also lie between 1 and 5, which suggests at most moderate correlation among the predictors and is unlikely to be problematic.
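To make the two columns of the tables above concrete: the tolerance of a predictor is 1 minus the R-squared obtained by regressing that predictor on all the other predictors, and the VIF is the reciprocal of the tolerance. The sketch below illustrates this in base R on simulated data; the variables x1, x2 and x3 are hypothetical and are not part of the Uganda data set.

```r
set.seed(1)

# Hypothetical predictors: x2 is deliberately correlated with x1
n  <- 500
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)
x3 <- rnorm(n)
X  <- data.frame(x1, x2, x3)

# VIF of each predictor from its auxiliary regression on the others
vif_by_hand <- sapply(names(X), function(v) {
  aux <- lm(reformulate(setdiff(names(X), v), response = v), data = X)
  tolerance <- 1 - summary(aux)$r.squared
  1 / tolerance
})
round(vif_by_hand, 3)

# Consistency check against the sa_mlr1 table: save_money_pct
vif_save_money <- 1 / 0.2562338
vif_save_money
```

The correlated pair (x1, x2) receives clearly higher VIF values than the independent x3, which mirrors how save_money_pct and LOG_last_amt_saved stand out in the tables above.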

10.1 Test for Non-Linearity

ols_plot_resid_fit(sa_mlr1)

10.2 Variable selection

sa_fw_mlr <- ols_step_forward_p(
  sa_mlr1,
  p_val = 0.05,
  details = FALSE)
plot(sa_fw_mlr)

10.3 Visualising model parameters

ggcoefstats(sa_mlr1,
            sort = "ascending")

10.4 Test for Normality Assumption

The code chunk below uses ols_plot_resid_hist() of olsrr package to perform normality assumption test.

ols_plot_resid_hist(sa_mlr1)

The figure reveals that the residuals of the multiple linear regression model (i.e. sa_mlr1) resemble a normal distribution.

For formal statistical test methods, the ols_test_normality() of olsrr package can be used as shown in the code chunk below.

ols_test_normality(sa_mlr1)
Warning in ks.test.default(y, "pnorm", mean(y), sd(y)): ties should not be
present for the one-sample Kolmogorov-Smirnov test
-----------------------------------------------
       Test             Statistic       pvalue  
-----------------------------------------------
Shapiro-Wilk              0.5989         0.0000 
Kolmogorov-Smirnov        0.3139         0.0000 
Cramer-von Mises        1052.2548        0.0000 
Anderson-Darling         427.5649        0.0000 
-----------------------------------------------

The summary table above reveals that the p-values of the four tests are far smaller than the alpha value of 0.05. Hence we reject the null hypothesis and infer that there is statistical evidence that the residuals are not normally distributed.
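A histogram can look roughly bell-shaped while formal tests still reject normality, especially with thousands of observations and heavy tails. The sketch below illustrates this with base R's shapiro.test() on simulated data; the two samples are hypothetical and only mimic the scale of the residuals above.

```r
set.seed(123)

# Hypothetical residuals: one truly normal sample, one heavy-tailed sample
res_normal <- rnorm(1000, mean = 0, sd = 0.005)
res_heavy  <- 0.005 * rt(1000, df = 2)

# Shapiro-Wilk test: the null hypothesis is that the data are normal
p_normal <- shapiro.test(res_normal)$p.value
p_heavy  <- shapiro.test(res_heavy)$p.value

c(p_normal = p_normal, p_heavy = p_heavy)
```

The heavy-tailed sample is rejected decisively even though most of its values sit in a narrow, symmetric peak around zero, much like the residual quartiles reported for this model.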

10.5 Testing for Spatial Autocorrelation

In order to perform the spatial autocorrelation test, we need to convert uganda_data.sf from an sf data frame into a SpatialPointsDataFrame. First, we export the residuals of the regression model and save them as a data frame.

mlr.output <- as.data.frame(sa_mlr1$residuals)

Next, we will join the residuals to the uganda_data.sf object and rename the new column to MLR_RES.

uganda_data.res.sf <- cbind(uganda_data.sf, 
                        sa_mlr1$residuals) %>%
rename(`MLR_RES` = `sa_mlr1.residuals`)

The code chunk below will be used to perform the data conversion process.

uganda_sa.sp <- as_Spatial(uganda_data.res.sf)
uganda_sa.sp
class       : SpatialPointsDataFrame 
features    : 3176 
extent      : -3395506, 718004.1, 9843626, 10407772  (xmin, xmax, ymin, ymax)
crs         : +proj=utm +zone=36 +south +datum=WGS84 +units=m +no_defs 
variables   : 63
names       :  HH_ID,  Region, Rural_Urban, age_band, gender, education_level, employment_status, mobile_user, national_ic_doc, passport_doc, utilities_bill_doc, pay_slip_doc, self_sustaining, financial_advice, save_money, ... 
min values  : 001001, CENTRAL,       Rural,        1,      1,               1,                 1,           1,               1,            1,                  1,            1,               1,                1,          1, ... 
max values  : 321087, WESTERN,       Urban,        7,      2,               9,                99,           2,               2,            2,                  2,            2,               2,                2,          2, ... 

The code chunk below will turn on the interactive mode of tmap.

tmap_mode("view")
tmap mode set to interactive viewing

The code chunk below is used to create an interactive point symbol map.

tm_shape(boundaries_cleaned)+
  tmap_options(check.and.fix = TRUE) +
  tm_polygons(alpha = 0.4) +
tm_shape(uganda_data.res.sf) +  
  tm_dots(col = "MLR_RES",
          alpha = 0.6,
          style="quantile") +
  tm_view(set.zoom.limits = c(11,14))
Variable(s) "MLR_RES" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
tmap_mode("plot")
tmap mode set to plotting

10.6 Spatial stationarity test

First, we will compute a row-standardised k-nearest-neighbour weight matrix (k = 6) by using the st_knn() and st_weights() functions of sfdep.

uganda_data_res_sf <- uganda_data.res.sf %>%
  mutate(nb = st_knn(geometry, k=6,
                     longlat = FALSE),
         wt = st_weights(nb,
                         style = "W"),
         .before = 1)

Next, global_moran_perm() of sfdep is used to perform global Moran permutation test.

global_moran_perm(uganda_data_res_sf$MLR_RES, 
                  uganda_data_res_sf$nb, 
                  uganda_data_res_sf$wt, 
                  alternative = "two.sided", 
                  nsim = 99)

    Monte-Carlo simulation of Moran I

data:  x 
weights: listw  
number of simulations + 1: 100 

statistic = 0.016767, observed rank = 95, p-value = 0.1
alternative hypothesis: two.sided
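In the output above, the p-value of 0.1 is larger than 0.05, so at the 5% level we cannot reject the null hypothesis that the residuals are randomly distributed in space. To show what the permutation test does, the sketch below computes global Moran's I with row-standardised 6-nearest-neighbour weights and derives a pseudo p-value from 99 random permutations. It uses base R only and simulated data; the two-sided p-value convention shown is one common choice and is an assumption, so it may differ slightly from the one used internally by sfdep and spdep.

```r
set.seed(42)

# Hypothetical data: 200 random locations with spatially unstructured values
n      <- 200
coords <- cbind(x = runif(n), y = runif(n))
res    <- rnorm(n)

# Row-standardised k-nearest-neighbour weights (k = 6, as in the text)
k <- 6
d <- as.matrix(dist(coords))
diag(d) <- Inf                       # a point is not its own neighbour
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  W[i, order(d[i, ])[seq_len(k)]] <- 1 / k
}

# Global Moran's I for a value vector and a weight matrix
moran_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

observed  <- moran_i(res, W)
# Reference distribution: shuffle the values over the fixed locations
simulated <- replicate(99, moran_i(sample(res), W))

# Two-sided pseudo p-value from the position of the observed statistic
all_stats <- c(simulated, observed)
p_upper   <- mean(all_stats >= observed)
p_lower   <- mean(all_stats <= observed)
p_value   <- min(1, 2 * min(p_upper, p_lower))

c(moran_i = observed, p_value = p_value)
```

With only 99 simulations the smallest attainable two-sided p-value is 0.02, so a larger nsim gives a finer-grained result when the statistic is close to the rejection threshold.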

11. Building Geographically Weighted Regression Models using GWmodel

11.1 Building Fixed Bandwidth GWR Model

In the code chunk below, bw.gwr() of the GWmodel package is used to determine the optimal fixed bandwidth to use in the model. Notice that the argument adaptive is set to FALSE, which indicates that we are interested in computing a fixed bandwidth.

bw_fixed_sa <- bw.gwr(formula = savings_account_pct ~ passport_doc_pct + 
                        pay_slip_doc_pct  + self_sustaining_pct + save_money_pct + 
                        save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved + 
                        borrow_money_commercial_bank_pct + borrow_money_SACCO_pct + 
                        borrow_money_mobile_money_pct + LOG_distance_commerical_bank, 
                    data=uganda_data_res_sf,
                    approach="CV", 
                    kernel="gaussian", 
                    adaptive=FALSE, 
                    longlat=FALSE)
Take a cup of tea and have a break, it will take a few minutes.
          -----A kind suggestion from GWmodel development group
Fixed bandwidth: 2546005 CV score: 0.07780339 
Fixed bandwidth: 1573832 CV score: 0.07782599 
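The search settles on the bandwidth with the lower CV score, 2546005. To get a feel for what such a bandwidth means, the sketch below evaluates the gaussian kernel at a few distances. It assumes the gaussian kernel form documented for GWmodel, w = exp(-0.5 * (d / bw)^2), and that distances are in metres as implied by the projected CRS; the function name gauss_weight and the example distances are illustrative.

```r
bw <- 2546005   # fixed bandwidth from the bw.gwr() output, in metres

# Gaussian kernel: weight of an observation at distance d from the
# regression point (assumed form: exp(-0.5 * (d / bw)^2))
gauss_weight <- function(d, bw) exp(-0.5 * (d / bw)^2)

# Example distances: 0 km, 100 km, 500 km and 1000 km
d <- c(0, 1e5, 5e5, 1e6)
w <- gauss_weight(d, bw)

data.frame(distance_km = d / 1000, weight = round(w, 4))
```

With a bandwidth this large, observations hundreds of kilometres away still receive weights close to 1, so every local regression uses almost the same weighting and a fixed-bandwidth GWR is expected to behave much like the global model.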

11.2 GWmodel method - fixed bandwidth

Use the code chunk below to calibrate the gwr model using fixed bandwidth and gaussian kernel.

gwr.fixed_sa <- gwr.basic(formula = savings_account_pct ~ passport_doc_pct + 
                        pay_slip_doc_pct  + self_sustaining_pct + save_money_pct + 
                        save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved + 
                        borrow_money_commercial_bank_pct + borrow_money_SACCO_pct + 
                        borrow_money_mobile_money_pct + LOG_distance_commerical_bank, 
                    data = uganda_data_res_sf,
                    bw = bw_fixed_sa, 
                    kernel = "gaussian",
                    longlat = FALSE)

The output is saved in a list of class “gwrm”. The code below can be used to display the model output.

gwr.fixed_sa
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-15 02:10:50.005064 
   Call:
   gwr.basic(formula = savings_account_pct ~ passport_doc_pct + 
    pay_slip_doc_pct + self_sustaining_pct + save_money_pct + 
    save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved + 
    borrow_money_commercial_bank_pct + borrow_money_SACCO_pct + 
    borrow_money_mobile_money_pct + LOG_distance_commerical_bank, 
    data = uganda_data_res_sf, bw = bw_fixed_sa, kernel = "gaussian", 
    longlat = FALSE)

   Dependent (y) variable:  savings_account_pct
   Independent variables:  passport_doc_pct pay_slip_doc_pct self_sustaining_pct save_money_pct save_money_commercial_bank_pct save_money_SACCO_pct LOG_last_amt_saved borrow_money_commercial_bank_pct borrow_money_SACCO_pct borrow_money_mobile_money_pct LOG_distance_commerical_bank
   Number of data points: 3176
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
      Min        1Q    Median        3Q       Max 
-0.032961 -0.000829 -0.000054  0.000397  0.017458 

   Coefficients:
                                      Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                       0.0166977  0.0020516   8.139 5.68e-16 ***
   passport_doc_pct                  0.0468483  0.0150517   3.112 0.001872 ** 
   pay_slip_doc_pct                  0.0535268  0.0156865   3.412 0.000652 ***
   self_sustaining_pct               0.0227272  0.0079988   2.841 0.004521 ** 
   save_money_pct                   -0.0656226  0.0113683  -5.772 8.57e-09 ***
   save_money_commercial_bank_pct    0.1834748  0.0105622  17.371  < 2e-16 ***
   save_money_SACCO_pct              0.2193000  0.0100951  21.723  < 2e-16 ***
   LOG_last_amt_saved                0.0005877  0.0001678   3.502 0.000468 ***
   borrow_money_commercial_bank_pct  0.0953930  0.0226693   4.208 2.65e-05 ***
   borrow_money_SACCO_pct            0.0955746  0.0200753   4.761 2.01e-06 ***
   borrow_money_mobile_money_pct     0.0496369  0.0170359   2.914 0.003597 ** 
   LOG_distance_commerical_bank      0.0006512  0.0002217   2.937 0.003334 ** 

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 0.004901 on 3164 degrees of freedom
   Multiple R-squared: 0.3141
   Adjusted R-squared: 0.3117 
   F-statistic: 131.7 on 11 and 3164 DF,  p-value: < 2.2e-16 
   ***Extra Diagnostic information
   Residual sum of squares: 0.07599864
   Sigma(hat): 0.004893273
   AIC:  -24754.87
   AICc:  -24754.75
   BIC:  -27747.22
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: gaussian 
   Fixed bandwidth: 2546005 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                           Min.     1st Qu.      Median
   Intercept                         0.01624078  0.01626142  0.01627810
   passport_doc_pct                  0.04425666  0.04665337  0.04678180
   pay_slip_doc_pct                  0.04617598  0.05344475  0.05384274
   self_sustaining_pct               0.02110392  0.02360797  0.02361127
   save_money_pct                   -0.06814142 -0.06544134 -0.06534997
   save_money_commercial_bank_pct    0.18247981  0.18272256  0.18289746
   save_money_SACCO_pct              0.21497232  0.22035550  0.22047133
   LOG_last_amt_saved                0.00057433  0.00057496  0.00057544
   borrow_money_commercial_bank_pct  0.09597604  0.09797034  0.09822798
   borrow_money_SACCO_pct            0.08112327  0.09800867  0.09871645
   borrow_money_mobile_money_pct     0.04889972  0.04911912  0.04928691
   LOG_distance_commerical_bank      0.00062156  0.00062261  0.00062350
                                        3rd Qu.    Max.
   Intercept                         0.01629940  0.0179
   passport_doc_pct                  0.04688121  0.0470
   pay_slip_doc_pct                  0.05410534  0.0545
   self_sustaining_pct               0.02361922  0.0236
   save_money_pct                   -0.06527789 -0.0652
   save_money_commercial_bank_pct    0.18312609  0.1887
   save_money_SACCO_pct              0.22054487  0.2206
   LOG_last_amt_saved                0.00057604  0.0006
   borrow_money_commercial_bank_pct  0.09862224  0.0993
   borrow_money_SACCO_pct            0.09911163  0.0999
   borrow_money_mobile_money_pct     0.04950421  0.0548
   LOG_distance_commerical_bank      0.00062480  0.0007
   ************************Diagnostic information*************************
   Number of data points: 3176 
   Effective number of parameters (2trace(S) - trace(S'S)): 13.15078 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 3162.849 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): -24755.9 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): -24770.66 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): -27857.41 
   Residual sum of squares: 0.07594065 
   R-square value:  0.3146122 
   Adjusted R-square value:  0.3117615 

   ***********************************************************************
   Program stops at: 2024-11-15 02:10:54.891171 

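The report shows that the AICc of the fixed-bandwidth GWR model is -24755.9, which is only marginally smaller than the AICc of the global multiple linear regression model (-24754.75). The local coefficient estimates also vary very little around the global estimates, so the fixed-bandwidth model offers almost no improvement over the global model.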

11.3 Building Adaptive Bandwidth GWR Model

Calibrate the GWR model by using the adaptive bandwidth approach.

11.3.1 Computing the adaptive bandwidth

Use bw.gwr() to determine the recommended number of data points (nearest neighbours) to use.

The code chunk looks very similar to the one used to compute the fixed bandwidth, except that the adaptive argument is changed to TRUE.

bw.adaptive_sa <- bw.gwr(formula = savings_account_pct ~ passport_doc_pct + 
                        pay_slip_doc_pct  + self_sustaining_pct + save_money_pct + 
                        save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved + 
                        borrow_money_commercial_bank_pct + borrow_money_SACCO_pct + 
                        borrow_money_mobile_money_pct + LOG_distance_commerical_bank, 
                    data=uganda_data_res_sf,
                    approach="CV", 
                    kernel="gaussian", 
                    adaptive=TRUE, 
                    longlat=FALSE)
Take a cup of tea and have a break, it will take a few minutes.
          -----A kind suggestion from GWmodel development group
Adaptive bandwidth: 1970 CV score: 0.07684179 
Adaptive bandwidth: 1225 CV score: 0.07640372 
Adaptive bandwidth: 764 CV score: 0.07639378 

The result shows that 764 is the recommended number of data points (nearest neighbours) to be used.
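An adaptive bandwidth is expressed as a number of nearest neighbours rather than a distance, so the effective kernel radius shrinks where observations are dense and grows where they are sparse. The sketch below illustrates the idea in base R with simulated points. It is an illustration under stated assumptions, not GWmodel's internal code: the point pattern is hypothetical, k = 50 is chosen only for the toy data, and excluding the point itself from its neighbours is a convention assumed here.

```r
set.seed(7)

# Hypothetical locations: a dense cluster plus a sparse scatter
dense  <- cbind(rnorm(200, 0, 1), rnorm(200, 0, 1))
sparse <- cbind(runif(100, -20, 20), runif(100, -20, 20))
coords <- rbind(dense, sparse)

k <- 50   # adaptive bandwidth: number of nearest neighbours
d <- as.matrix(dist(coords))

# Local bandwidth = distance to the k-th nearest neighbour.
# Each sorted row starts with the point itself (distance 0),
# hence the k + 1 index.
local_bw <- apply(d, 1, function(row) sort(row)[k + 1])

# Compare typical radii in the dense and the sparse part
c(dense_median  = median(local_bw[1:200]),
  sparse_median = median(local_bw[201:300]))
```

In the Uganda data, the same logic means that a bandwidth of 764 neighbours covers a small area where households are clustered and a much larger area where they are spread out.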

11.3.2 Constructing the adaptive bandwidth gwr model

gwr_adaptive_sa <- gwr.basic(formula = savings_account_pct ~ passport_doc_pct + 
                        pay_slip_doc_pct  + self_sustaining_pct + save_money_pct + 
                        save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved + 
                        borrow_money_commercial_bank_pct + borrow_money_SACCO_pct + 
                        borrow_money_mobile_money_pct + LOG_distance_commerical_bank, 
                  data=uganda_data.sf,
                  bw=bw.adaptive_sa, 
                  kernel = 'gaussian', 
                  adaptive=TRUE, 
                  longlat = FALSE)

The code below can be used to display the model output.

gwr_adaptive_sa
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-15 02:11:01.02247 
   Call:
   gwr.basic(formula = savings_account_pct ~ passport_doc_pct + 
    pay_slip_doc_pct + self_sustaining_pct + save_money_pct + 
    save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved + 
    borrow_money_commercial_bank_pct + borrow_money_SACCO_pct + 
    borrow_money_mobile_money_pct + LOG_distance_commerical_bank, 
    data = uganda_data.sf, bw = bw.adaptive_sa, kernel = "gaussian", 
    adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  savings_account_pct
   Independent variables:  passport_doc_pct pay_slip_doc_pct self_sustaining_pct save_money_pct save_money_commercial_bank_pct save_money_SACCO_pct LOG_last_amt_saved borrow_money_commercial_bank_pct borrow_money_SACCO_pct borrow_money_mobile_money_pct LOG_distance_commerical_bank
   Number of data points: 3176
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
      Min        1Q    Median        3Q       Max 
-0.032961 -0.000829 -0.000054  0.000397  0.017458 

   Coefficients:
                                      Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                       0.0166977  0.0020516   8.139 5.68e-16 ***
   passport_doc_pct                  0.0468483  0.0150517   3.112 0.001872 ** 
   pay_slip_doc_pct                  0.0535268  0.0156865   3.412 0.000652 ***
   self_sustaining_pct               0.0227272  0.0079988   2.841 0.004521 ** 
   save_money_pct                   -0.0656226  0.0113683  -5.772 8.57e-09 ***
   save_money_commercial_bank_pct    0.1834748  0.0105622  17.371  < 2e-16 ***
   save_money_SACCO_pct              0.2193000  0.0100951  21.723  < 2e-16 ***
   LOG_last_amt_saved                0.0005877  0.0001678   3.502 0.000468 ***
   borrow_money_commercial_bank_pct  0.0953930  0.0226693   4.208 2.65e-05 ***
   borrow_money_SACCO_pct            0.0955746  0.0200753   4.761 2.01e-06 ***
   borrow_money_mobile_money_pct     0.0496369  0.0170359   2.914 0.003597 ** 
   LOG_distance_commerical_bank      0.0006512  0.0002217   2.937 0.003334 ** 

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 0.004901 on 3164 degrees of freedom
   Multiple R-squared: 0.3141
   Adjusted R-squared: 0.3117 
   F-statistic: 131.7 on 11 and 3164 DF,  p-value: < 2.2e-16 
   ***Extra Diagnostic information
   Residual sum of squares: 0.07599864
   Sigma(hat): 0.004893273
   AIC:  -24754.87
   AICc:  -24754.75
   BIC:  -27747.22
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: gaussian 
   Adaptive bandwidth: 764 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                           Min.     1st Qu.      Median
   Intercept                         0.01100786  0.01328636  0.01456606
   passport_doc_pct                  0.02561207  0.03882329  0.04702325
   pay_slip_doc_pct                 -0.01638144  0.00222473  0.04980080
   self_sustaining_pct               0.00464580  0.01102064  0.01799666
   save_money_pct                   -0.09221844 -0.07057591 -0.05649395
   save_money_commercial_bank_pct    0.11863544  0.13547024  0.17265364
   save_money_SACCO_pct              0.19076097  0.21804899  0.23073936
   LOG_last_amt_saved                0.00035383  0.00047877  0.00051201
   borrow_money_commercial_bank_pct -0.01082834  0.03301703  0.08165212
   borrow_money_SACCO_pct            0.00749615  0.08501548  0.11901933
   borrow_money_mobile_money_pct     0.01177161  0.02741277  0.05205928
   LOG_distance_commerical_bank      0.00029778  0.00038608  0.00051759
                                        3rd Qu.    Max.
   Intercept                         0.01569714  0.0181
   passport_doc_pct                  0.05642495  0.0828
   pay_slip_doc_pct                  0.10485771  0.1731
   self_sustaining_pct               0.02399954  0.0415
   save_money_pct                   -0.04559956 -0.0353
   save_money_commercial_bank_pct    0.21462934  0.2431
   save_money_SACCO_pct              0.26313271  0.3168
   LOG_last_amt_saved                0.00054552  0.0007
   borrow_money_commercial_bank_pct  0.14267422  0.2249
   borrow_money_SACCO_pct            0.14935916  0.2099
   borrow_money_mobile_money_pct     0.08735646  0.1175
   LOG_distance_commerical_bank      0.00066291  0.0008
   ************************Diagnostic information*************************
   Number of data points: 3176 
   Effective number of parameters (2trace(S) - trace(S'S)): 42.64019 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 3133.36 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): -24895.67 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): -24929.82 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): -27883.62 
   Residual sum of squares: 0.07180195 
   R-square value:  0.3519653 
   Adjusted R-square value:  0.3431437 

   ***********************************************************************
   Program stops at: 2024-11-15 02:11:06.529093 
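The three calibrated models can be compared side by side using the diagnostics printed in the reports above. The sketch below collects the AICc and adjusted R-squared values into a small base R data frame; the data frame layout and model labels are illustrative. A commonly cited rule of thumb in the GWR literature treats an AICc difference of more than about 3 as a meaningful improvement, which is used here as a working assumption.

```r
# Diagnostics copied from the model reports above
models <- data.frame(
  model  = c("global_ols", "gwr_fixed", "gwr_adaptive"),
  aicc   = c(-24754.75, -24755.9, -24895.67),
  adj_r2 = c(0.3117, 0.3117615, 0.3431437)
)

# Difference from the best (lowest) AICc
models$delta_aicc <- models$aicc - min(models$aicc)

# Best model first
ranked <- models[order(models$aicc), ]
ranked
```

The adaptive-bandwidth model has an AICc about 141 lower than the global model and raises the adjusted R-squared from about 0.312 to 0.343, whereas the fixed-bandwidth model is practically indistinguishable from the global one.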

11.3.4 Visualising GWR Output

To visualise the fields in SDF, we need to first convert it into an sf data frame by using the code chunk below.

gwr_adaptive_output <- as.data.frame(
  gwr_adaptive_sa$SDF) %>%
  select(-c(2:15))
gwr_sf_adaptive <- cbind(uganda_data.sf,
                         gwr_adaptive_output)

Next, glimpse() is used to display the content of the gwr_sf_adaptive sf data frame.

glimpse(gwr_sf_adaptive)
Rows: 3,176
Columns: 92
$ HH_ID                               <chr> "001001", "001019", "001028", "001…
$ Region                              <chr> "NORTHERN", "NORTHERN", "NORTHERN"…
$ Rural_Urban                         <chr> "Urban", "Urban", "Urban", "Urban"…
$ age_band                            <dbl> 4, 4, 3, 4, 4, 1, 4, 5, 3, 2, 2, 4…
$ gender                              <dbl> 2, 2, 2, 1, 2, 1, 2, 1, 1, 2, 2, 1…
$ education_level                     <dbl> 6, 2, 1, 2, 3, 2, 3, 6, 2, 2, 3, 6…
$ employment_status                   <dbl> 1, 5, 5, 1, 4, 9, 1, 7, 1, 5, 5, 4…
$ mobile_user                         <dbl> 2, 1, 2, 2, 1, 2, 1, 2, 1, 1, 1, 1…
$ national_ic_doc                     <dbl> 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1…
$ passport_doc                        <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1…
$ utilities_bill_doc                  <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ pay_slip_doc                        <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ self_sustaining                     <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ financial_advice                    <dbl> 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1…
$ save_money                          <dbl> 1, 1, 2, 2, 1, 2, 1, 1, 2, 1, 2, 2…
$ save_money_commercial_bank          <dbl> 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2…
$ save_money_SACCO                    <dbl> 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ save_money_mobile_money             <dbl> 1, 2, 2, 2, 2, 2, 1, 2, 2, 1, 2, 2…
$ last_amt_saved                      <dbl> 4, 1, 9, 9, 1, 9, 1, 8, 9, 1, 9, 9…
$ last_amt_borrowed                   <dbl> 2, 1, 998, 998, 1, 998, 2, 998, 1,…
$ borrow_money_commercial_bank        <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2…
$ borrow_money_SACCO                  <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ borrow_money_mobile_money           <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2…
$ last_amt_sent                       <dbl> 3, 1, 998, 998, 1, 998, 1, 2, 1, 9…
$ last_amt_received                   <dbl> 1, 1, 998, 998, 998, 998, 998, 997…
$ own_insurance                       <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ distance_commerical_bank            <dbl> 2, 2, 2, 2, 1, 1, 1, 2, 2, 2, 4, 4…
$ distance_SACCOS                     <dbl> 2, 2, 1, 2, 1, 1, 1, 2, 1, 2, 3, 2…
$ distance_ATM                        <dbl> 2, 2, 2, 4, 1, 1, 1, 2, 2, 2, 4, 4…
$ distance_mobile_money               <dbl> 1, 2, 3, 1, 1, 1, 1, 1, 2, 2, 2, 2…
$ savings_account                     <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ joint_account                       <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ county_name                         <chr> "LABWOR", "LABWOR", "LABWOR", "LAB…
$ LOG_age_band                        <dbl> 1.3862944, 1.3862944, 1.0986123, 1…
$ gender_pct                          <dbl> 0.06297229, 0.06297229, 0.06297229…
$ LOG_education_level                 <dbl> 1.7917595, 0.6931472, 0.0000000, 0…
$ LOG_employment_status               <dbl> 0.0000000, 1.6094379, 1.6094379, 0…
$ mobile_user_pct                     <dbl> 0.06297229, 0.03148615, 0.06297229…
$ national_ic_doc_pct                 <dbl> 0.03148615, 0.03148615, 0.03148615…
$ passport_doc_pct                    <dbl> 0.06297229, 0.06297229, 0.06297229…
$ utilities_bill_do_pct               <dbl> 0.06297229, 0.06297229, 0.06297229…
$ pay_slip_doc_pct                    <dbl> 0.06297229, 0.06297229, 0.06297229…
$ self_sustaining_pct                 <dbl> 0.06297229, 0.06297229, 0.06297229…
$ financial_advice_pct                <dbl> 0.03148615, 0.03148615, 0.06297229…
$ save_money_pct                      <dbl> 0.03148615, 0.03148615, 0.06297229…
$ save_money_commercial_bank_pct      <dbl> 0.06297229, 0.06297229, 0.06297229…
$ save_money_SACCO_pct                <dbl> 0.06297229, 0.03148615, 0.06297229…
$ save_money_mobile_money_pct         <dbl> 0.03148615, 0.06297229, 0.06297229…
$ LOG_last_amt_saved                  <dbl> 1.386294, 0.000000, 2.197225, 2.19…
$ LOG_last_amt_borrowed               <dbl> 0.6931472, 0.0000000, 6.9057533, 6…
$ borrow_money_commercial_bank_pct    <dbl> 0.06297229, 0.06297229, 0.06297229…
$ borrow_money_SACCO_pct              <dbl> 0.06297229, 0.06297229, 0.06297229…
$ borrow_money_mobile_money_pct       <dbl> 0.06297229, 0.06297229, 0.06297229…
$ LOG_last_amt_sent                   <dbl> 1.0986123, 0.0000000, 6.9057533, 6…
$ LOG_last_amt_received               <dbl> 0.000000, 0.000000, 6.905753, 6.90…
$ own_insurance_pct                   <dbl> 0.06297229, 0.06297229, 0.06297229…
$ LOG_distance_commerical_bank        <dbl> 0.6931472, 0.6931472, 0.6931472, 0…
$ LOG_distance_SACCOS                 <dbl> 0.6931472, 0.6931472, 0.0000000, 0…
$ LOG_distance_ATM                    <dbl> 0.6931472, 0.6931472, 0.6931472, 1…
$ LOG_distance_mobile_money           <dbl> 0.0000000, 0.6931472, 1.0986123, 0…
$ savings_account_pct                 <dbl> 0.06297229, 0.06297229, 0.06297229…
$ joint_account_pct                   <dbl> 0.06297229, 0.06297229, 0.06297229…
$ Intercept                           <dbl> 0.01374018, 0.01374214, 0.01373844…
$ CV_Score                            <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ Stud_residual                       <dbl> -0.1563762060, 1.4451162350, 0.059…
$ Intercept_SE                        <dbl> 0.002583321, 0.002583638, 0.002585…
$ passport_doc_pct_SE                 <dbl> 0.01850887, 0.01851101, 0.01852315…
$ pay_slip_doc_pct_SE                 <dbl> 0.02009748, 0.02010038, 0.02011371…
$ self_sustaining_pct_SE              <dbl> 0.01040533, 0.01040718, 0.01041443…
$ save_money_pct_SE                   <dbl> 0.01375069, 0.01375226, 0.01376000…
$ save_money_commercial_bank_pct_SE   <dbl> 0.01342778, 0.01342961, 0.01343743…
$ save_money_SACCO_pct_SE             <dbl> 0.01397570, 0.01397852, 0.01399230…
$ LOG_last_amt_saved_SE               <dbl> 0.0002015484, 0.0002015690, 0.0002…
$ borrow_money_commercial_bank_pct_SE <dbl> 0.02771849, 0.02772048, 0.02773078…
$ borrow_money_SACCO_pct_SE           <dbl> 0.02956245, 0.02956804, 0.02959599…
$ borrow_money_mobile_money_pct_SE    <dbl> 0.02012100, 0.02012268, 0.02013044…
$ LOG_distance_commerical_bank_SE     <dbl> 0.0002695990, 0.0002696291, 0.0002…
$ Intercept_TV                        <dbl> 5.318805, 5.318912, 5.313990, 5.30…
$ passport_doc_pct_TV                 <dbl> 4.283297, 4.282622, 4.283510, 4.28…
$ pay_slip_doc_pct_TV                 <dbl> 6.106481, 6.106079, 6.109939, 6.11…
$ self_sustaining_pct_TV              <dbl> 2.044626, 2.043120, 2.041662, 2.04…
$ save_money_pct_TV                   <dbl> -3.376885, -3.375175, -3.369971, -…
$ save_money_commercial_bank_pct_TV   <dbl> 9.783221, 9.779953, 9.767750, 9.75…
$ save_money_SACCO_pct_TV             <dbl> 15.48776, 15.48864, 15.47912, 15.4…
$ LOG_last_amt_saved_TV               <dbl> 2.643905, 2.642684, 2.639057, 2.63…
$ borrow_money_commercial_bank_pct_TV <dbl> 0.3386409, 0.3377677, 0.3331410, 0…
$ borrow_money_SACCO_pct_TV           <dbl> 6.741598, 6.739035, 6.732029, 6.72…
$ borrow_money_mobile_money_pct_TV    <dbl> 0.8783295, 0.8780371, 0.8753335, 0…
$ LOG_distance_commerical_bank_TV     <dbl> 2.582395, 2.582094, 2.581151, 2.57…
$ Local_R2                            <dbl> 0.3668502, 0.3668682, 0.3669406, 0…
$ geometry                            <POINT [m]> POINT (572712.4 10295984), P…
$ geometry.1                          <POINT [m]> POINT (572712.4 10295984), P…
summary(gwr_adaptive_sa$SDF$yhat)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.03960 0.06251 0.06308 0.06184 0.06352 0.06540 

11.4 Visualising local R2

The code chunk below creates an interactive point symbol map of the local R2 values.

tmap_mode("view")
tmap mode set to interactive viewing
tmap_options(check.and.fix = TRUE)
tm_shape(boundaries_sf)+
  tm_polygons(alpha = 0.1) +
tm_shape(gwr_sf_adaptive) +  
  tm_dots(col = "Local_R2",
          border.col = "gray60",
          border.lwd = 1) +
  tm_view(set.zoom.limits = c(11,14))
tmap_mode("plot")
tmap mode set to plotting

11.5 Visualising coefficient estimates

The code chunk below creates two synchronised interactive point symbol maps, showing the standard errors and the t-values of the passport_doc_pct coefficient side by side.

tmap_mode("view")
tmap mode set to interactive viewing
passport_doc_pct_SE <- tm_shape(boundaries_cleaned)+
  tm_polygons(alpha = 0.1) +
tm_shape(gwr_sf_adaptive) +  
  tm_dots(col = "passport_doc_pct_SE",
          border.col = "gray60",
          border.lwd = 1) +
  tm_view(set.zoom.limits = c(11,14))

passport_doc_pct_TV <- tm_shape(boundaries_cleaned)+
  tm_polygons(alpha = 0.1) +
tm_shape(gwr_sf_adaptive) +  
  tm_dots(col = "passport_doc_pct_TV",
          border.col = "gray60",
          border.lwd = 1) +
  tm_view(set.zoom.limits = c(11,14))

tmap_arrange(passport_doc_pct_SE, passport_doc_pct_TV, 
             asp=1, ncol=2,
             sync = TRUE)
tmap_mode("plot")
tmap mode set to plotting
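
The t-value fields (suffix _TV) can also be turned into a simple significance flag before mapping. The sketch below uses the first three t-values of two coefficients as printed in the glimpse() output. The 1.96 cut-off is an assumption (roughly a two-sided 5% level for a large sample) and does not correct for multiple testing across locations.

```r
# First three t-values of two coefficients, copied from the glimpse() output
tv <- data.frame(
  passport_doc_pct_TV = c(4.283297, 4.282622, 4.283510),
  borrow_money_commercial_bank_pct_TV = c(0.3386409, 0.3377677, 0.3331410)
)

# ASSUMED threshold: |t| > 1.96, approximately a two-sided 5% level
t_crit <- 1.96

# TRUE where the local coefficient would be flagged as significant
sig <- as.data.frame(lapply(tv, function(t) abs(t) > t_crit))
print(sig)
```

At these three locations the passport_doc_pct coefficient is flagged and the borrow_money_commercial_bank_pct coefficient is not. Because GWR fits one regression per location, an adjusted test such as the one offered by gwr.t.adjust() in GWmodel may be a safer basis for conclusions than a fixed cut-off.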